The search functionality is under construction.

Author Search Result

[Author] Kazuyoshi TAKAGI(25hit)

21-25hit(25hit)

  • A VLSI Algorithm for Division in GF(2m) Based on Extended Binary GCD Algorithm

    Yasuaki WATANABE  Naofumi TAKAGI  Kazuyoshi TAKAGI  

     
    PAPER

      Vol:
    E85-A No:5
      Page(s):
    994-999

    A VLSI algorithm for division in GF(2m) with the canonical basis representation is proposed. It is based on the extended Binary GCD algorithm for GF(2m), and performs division through iteration of simple operations, such as shifts and bitwise exclusive-OR operations. A divider in GF(2m) based on the algorithm has a linear array structure with a bit-slice feature and carries out division in 2m clock cycles. The amount of hardware of the divider is proportional to m and the depth is a constant independent of m.

  • A VLSI Architecture for Output Probability Computations of HMM-Based Recognition Systems with Store-Based Block Parallel Processing

    Kazuhiro NAKAMURA  Masatoshi YAMAMOTO  Kazuyoshi TAKAGI  Naofumi TAKAGI  

     
    PAPER-VLSI Systems

      Vol:
    E93-D No:2
      Page(s):
    300-305

    In this paper, a fast and memory-efficient VLSI architecture for output probability computations of continuous Hidden Markov Models (HMMs) is presented. These computations are the most time-consuming part of HMM-based recognition systems. High-speed VLSI architectures with small registers and low-power dissipation are required for the development of mobile embedded systems with capable human interfaces. We demonstrate store-based block parallel processing (StoreBPP) for output probability computations and present a VLSI architecture that supports it. When the number of HMM states is adequate for accurate recognition, compared with conventional stream-based block parallel processing (StreamBPP) architectures, the proposed architecture requires fewer registers and processing elements and less processing time. The processing elements used in the StreamBPP architecture are identical to those used in the StoreBPP architecture. From a VLSI architectural viewpoint, a comparison shows the efficiency of the proposed architecture through efficient use of registers for storing input feature vectors and intermediate results during computation.

  • High-Throughput Rapid Single-Flux-Quantum Circuit Implementations for Exponential and Logarithm Computation Using the Radix-2 Signed-Digit Representation

    Masamitsu TANAKA  Kazuyoshi TAKAGI  Naofumi TAKAGI  

     
    PAPER

      Vol:
    E99-C No:6
      Page(s):
    703-709

    We present circuit implementations for computing exponentials and logarithms suitable for rapid single-flux-quantum (RSFQ) logic. We propose hardware algorithms based on the sequential table-lookup (STL) method using the radix-2 signed-digit representation that achieve high-throughput, digit-serial calculations. The circuits are implemented by processing elements formed in systolic-array-like, regularly-aligned pipeline structures. The processing elements are composed of adders, shifters, and readouts of precomputed constants. The iterative calculations are fully overlapped, and throughputs approach the maximum throughput of serial processing. The circuit size for calculating significand parts is estimated to be approximately 5-10 times larger than that of a bit-serial floating-point adder or multiplier.

  • Exact Minimization of Free BDDs and Its Application to Pass-Transistor Logic Optimization

    Kazuyoshi TAKAGI  Hiroshi HATAKEDA  Shinji KIMURA  Katsumasa WATANABE  

     
    PAPER

      Vol:
    E82-A No:11
      Page(s):
    2407-2413

    In several design methods for Pass-transistor Logic (PTL) circuits, Boolean functions are expressed as OBDDs in decomposed form and then the component OBDDs are directly mapped to PTL cells. The total size of OBDDs (number of nodes) corresponds to the circuit size. In this paper, we investigate a method for PTL synthesis based on exact minimization of Free BDDs (FBDDs). FBDDs are well-studied extension of OBDDs with free variable ordering on each path. We present statistics showing that more than 56% of 616126 NPN-equivalence classes of 5-variable Boolean functions have minimum FBDDs with less size than their OBDDs. This result can be used for PTL synthesis as libraries. We also applied the exact minimization algorithm of FBDDs to the minimization of subcircuits in the synthesis for MCNC benchmarks and found up to 5% size reduction.

  • Algorithms for Evaluating the Matrix Polynomial I+A+A2+…+AN-1 with Reduced Number of Matrix Multiplications

    Kotaro MATSUMOTO  Kazuyoshi TAKAGI  Naofumi TAKAGI  

     
    PAPER-Algorithms and Data Structures

      Vol:
    E101-A No:2
      Page(s):
    467-471

    The problem of evaluating the matrix polynomial I+A+A2+…+AN-1 with a reduced number of matrix multiplications has long been considered. Several algorithms have been proposed for this problem, which find a procedure requiring O(log N) matrix multiplications for a given N. Among them, the hybrid algorithm based on the double-base representation of N, i.e., using mixed radices 2 and 3, proposed by Dimitrov and Cooklev is most efficient. It has been suggested by them that the use of higher radices would not bring any more efficient algorithms. In this paper, we show that we can derive more efficient algorithms by using higher radices, and propose several efficient algorithms.

21-25hit(25hit)